feat: init model experiments, add doc reader #38

Merged
leomaurodesenv merged 6 commits into main from feat/model-experiments
Jul 30, 2025

@leomaurodesenv
Owner

feat(experiments): Add document reader experiments

This pull request introduces a framework for running and evaluating document reader experiments. It enables standardized testing of extractive Question Answering models on the QASports dataset and other popular QA datasets, using the Haystack framework.

Key Changes:

  • New Experiments Module:

    • Adds a new experiments directory containing the logic for running the evaluations.
    • experiments/doc_reader.py: A script to set up and run a Haystack pipeline for document reading tasks. It evaluates models and prints performance metrics.
    • experiments/module.py: A supporting module that encapsulates dataset handling and model selection. It includes:
      • Enums for Dataset, DocReader, and Sports.
      • Abstract and concrete classes for loading and preparing datasets like QASports, SQuAD, AdversarialQA, and DuoRC.
      • Integration with the leomaurodesenv/QASports2 dataset from the Hugging Face Hub.
  • Dynamic Experiment Configuration:

    • The doc_reader.py script uses argparse to select the dataset, the model, and the specific sport (for QASports) via command-line arguments, so an experiment can be configured and repeated without editing the code.
  • Improved Documentation:

    • The main README.md has been updated with a new "Performing Experiments" section, providing clear instructions on how to set up the environment and run the new experiment scripts.
    • Docstrings within the new modules have been improved for better code clarity.
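
The structure described for experiments/module.py (enums plus abstract and concrete dataset classes) could look roughly like the sketch below. This is an illustration only, not the code merged in this PR: the enum member names and values, the class names `AbstractDataset` and `InMemoryDataset`, and the `question`/`context`/`answer` row keys are all assumptions. The stand-in loader keeps rows in memory so the sketch runs without downloading anything; a real loader would fetch from the Hugging Face Hub instead.

```python
from abc import ABC, abstractmethod
from enum import Enum


class Dataset(Enum):
    # Assumed member names, based on the datasets named in this PR.
    QASPORTS = "qasports"
    SQUAD = "squad"
    ADVERSARIAL_QA = "adversarial_qa"
    DUORC = "duorc"


class AbstractDataset(ABC):
    """Hypothetical base class: subclasses decide how raw rows are fetched."""

    def __init__(self, name: Dataset):
        self.name = name

    @abstractmethod
    def load_rows(self) -> list[dict]:
        """Return raw rows with 'question', 'context', and 'answer' keys."""

    def prepare(self) -> list[tuple[str, str, str]]:
        # Drop rows with a missing or empty answer, since an extractive
        # reader needs a gold answer span to be scored against.
        return [
            (row["question"], row["context"], row["answer"])
            for row in self.load_rows()
            if row.get("answer")
        ]


class InMemoryDataset(AbstractDataset):
    """Stand-in for a concrete loader (assumption, for illustration)."""

    def __init__(self, name: Dataset, rows: list[dict]):
        super().__init__(name)
        self._rows = rows

    def load_rows(self) -> list[dict]:
        return list(self._rows)


sample = InMemoryDataset(
    Dataset.SQUAD,
    [
        {"question": "Who won?", "context": "Team A won.", "answer": "Team A"},
        {"question": "Who lost?", "context": "Unknown.", "answer": ""},
    ],
)
prepared = sample.prepare()
```

Keeping the filtering logic in the base class means each concrete dataset only has to describe where its rows come from, which is one plausible reason to split the module into abstract and concrete classes.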

How to Run the Experiments:

To run the new experiments, follow these steps:

# Set up `uv` on your machine
# https://github.com/astral-sh/uv
# Install packages
$ uv sync

# See available options for the document reader experiment
$ uv run -m experiments.doc_reader --help
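
For readers who want a sense of what `--help` might list, here is a minimal sketch of an argparse interface driven by enums. The flag names (`--dataset`, `--model`, `--sport`), their defaults, and every enum member shown are assumptions for illustration; the real options are whatever the script's `--help` output reports.

```python
import argparse
from enum import Enum


class Dataset(Enum):
    # Assumed values; the PR does not list the actual enum members.
    QASPORTS = "qasports"
    SQUAD = "squad"


class DocReader(Enum):
    # Hypothetical model identifiers.
    BERT = "bert"
    ROBERTA = "roberta"


class Sports(Enum):
    # Hypothetical subset of sports.
    ALL = "all"
    BASKETBALL = "basketball"
    SOCCER = "soccer"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="experiments.doc_reader",
        description="Run a document reader experiment (illustrative sketch).",
    )
    # Restricting choices to enum values makes argparse reject typos early.
    parser.add_argument("--dataset", choices=[d.value for d in Dataset],
                        default=Dataset.QASPORTS.value, help="dataset to evaluate on")
    parser.add_argument("--model", choices=[m.value for m in DocReader],
                        default=DocReader.BERT.value, help="document reader model")
    parser.add_argument("--sport", choices=[s.value for s in Sports],
                        default=Sports.ALL.value, help="sport subset (QASports only)")
    return parser


def parse_config(argv=None):
    """Parse CLI arguments and convert the strings back into enum members."""
    args = build_parser().parse_args(argv)
    return Dataset(args.dataset), DocReader(args.model), Sports(args.sport)


# Passing an explicit list stands in for real command-line arguments.
config = parse_config(["--dataset", "squad", "--model", "roberta"])
```

Deriving `choices` from the enums keeps the command line and the module definitions in sync: adding a new enum member would automatically expose it as a valid option.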

@leomaurodesenv self-assigned this Jul 30, 2025
@leomaurodesenv added the enhancement (New feature or request) label Jul 30, 2025
@leomaurodesenv merged commit 900012f into main Jul 30, 2025
1 check passed
@leomaurodesenv deleted the feat/model-experiments branch July 30, 2025 23:46
leomaurodesenv added a commit that referenced this pull request Aug 1, 2025
* feat(experiments): add document reader

* fix(experiments): huggingface datasets

* feat(experiments): add module and add qasports2

* fix(experiments): improve docstring

* feat(experiments): add argparser to doc reader

* feat(readme): add experiments section